
    Histogram of Oriented Principal Components for Cross-View Action Recognition

    Existing techniques for 3D action recognition are sensitive to viewpoint variations because they extract features from depth images, which are viewpoint dependent. In contrast, we directly process pointclouds for cross-view action recognition from unknown and unseen views. We propose the Histogram of Oriented Principal Components (HOPC) descriptor, which is robust to noise, viewpoint, scale and action speed variations. At a 3D point, HOPC is computed by projecting the three scaled eigenvectors of the pointcloud within its local spatio-temporal support volume onto the vertices of a regular dodecahedron. HOPC is also used to detect Spatio-Temporal Keypoints (STKs) in 3D pointcloud sequences, so that only the view-invariant STK descriptors (Local HOPC descriptors) at these key locations are used for action recognition. We also propose a global descriptor, computed from the normalized spatio-temporal distribution of STKs in 4-D, which we refer to as STK-D. We have evaluated the performance of our proposed descriptors against nine existing techniques on two cross-view and three single-view human action recognition datasets. Experimental results show that our techniques provide significant improvements over state-of-the-art methods.
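    The core of the descriptor is a small linear-algebra step. Below is a minimal, illustrative Python/NumPy sketch of a HOPC-style histogram at one 3D point; the spatio-temporal support selection, the paper's quantization threshold, and eigenvector sign disambiguation are omitted, and all function names are our own rather than the authors' code.

        import numpy as np

        def dodecahedron_vertices():
            # 20 unit directions: the vertices of a regular dodecahedron.
            phi = (1 + np.sqrt(5)) / 2
            verts = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
            for s1 in (-1, 1):
                for s2 in (-1, 1):
                    verts += [(0, s1 / phi, s2 * phi),
                              (s1 / phi, s2 * phi, 0),
                              (s1 * phi, 0, s2 / phi)]
            v = np.asarray(verts, dtype=float)
            return v / np.linalg.norm(v, axis=1, keepdims=True)

        def hopc(points):
            # points: (N, 3) neighbours inside the local spatio-temporal support.
            centered = points - points.mean(axis=0)
            cov = centered.T @ centered / len(points)
            evals, evecs = np.linalg.eigh(cov)          # eigenvalues, ascending
            order = np.argsort(evals)[::-1]             # principal components first
            dirs = dodecahedron_vertices()              # (20, 3)
            hist = []
            for k in order:
                proj = dirs @ (evals[k] * evecs[:, k])  # project scaled eigenvector
                hist.append(np.clip(proj, 0.0, None))   # simplified one-sided binning
            hist = np.concatenate(hist)
            return hist / (np.linalg.norm(hist) + 1e-12)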

    Action Classification with Locality-constrained Linear Coding

    We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information about human body variations in each spatiotemporal subsequence of a video sequence. Our method divides the input video into equally spaced, overlapping spatiotemporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram of Oriented Gradient (HOG3D) feature to encode the information in each cell. We justify the use of LLC for encoding the block descriptors by demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor is obtained via a logistic regression classifier with L2 regularization. We evaluate and compare our algorithm with ten state-of-the-art algorithms on five benchmark datasets. Experimental results show that, on average, our algorithm gives better accuracy than these ten algorithms.
    Comment: ICPR 2014
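    The LLC code for a block descriptor has a closed-form approximation (from the original LLC paper, Wang et al., CVPR 2010). A minimal sketch, assuming a learned codebook; the parameter defaults are illustrative, not the paper's settings:

        import numpy as np

        def llc_encode(x, codebook, k=5, beta=1e-4):
            # x: (d,) local feature (e.g. a HOG3D block descriptor);
            # codebook: (M, d) learned dictionary. Returns a sparse (M,) code.
            # 1. Restrict to the k nearest atoms (the locality constraint).
            dists = np.linalg.norm(codebook - x, axis=1)
            idx = np.argsort(dists)[:k]
            B = codebook[idx]                     # (k, d) local basis
            # 2. Solve min ||x - c^T B||^2 subject to sum(c) = 1.
            z = B - x                             # shift the basis to the origin
            C = z @ z.T                           # (k, k) covariance
            C += beta * np.trace(C) * np.eye(k)   # regularize for stability
            c = np.linalg.solve(C, np.ones(k))
            c /= c.sum()                          # enforce the constraint
            # 3. Scatter back into a full-length code.
            code = np.zeros(len(codebook))
            code[idx] = c
            return code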

    Theoretical Design and Analysis of Multivolume Digital Assays with Wide Dynamic Range Validated Experimentally with Microfluidic Digital PCR

    This paper presents a protocol using theoretical methods and free software to design and analyze multivolume digital PCR (MV digital PCR) devices; the theory and software are also applicable to the design and analysis of dilution series in digital PCR. MV digital PCR minimizes the total number of wells required for “digital” (single molecule) measurements while maintaining high dynamic range and high resolution. In some examples, multivolume designs with fewer than 200 total wells are predicted to provide a dynamic range with 5-fold resolution similar to that of single-volume designs requiring 12,000 wells. Mathematical techniques were employed and extended to maximize the information obtained from each experiment and to quantify device performance, and the designs were experimentally validated using the SlipChip platform. MV digital PCR was demonstrated to perform reliably, and results from wells of different volumes agreed with one another. No artifacts due to different surface-to-volume ratios were observed, and single-molecule amplification in volumes ranging from 1 to 125 nL was self-consistent. The device presented here was designed to meet the testing requirements for measuring clinically relevant levels of HIV viral load at the point of care (in plasma, 1,000,000 molecules/mL), and the predicted resolution and dynamic range were experimentally validated using a control sequence of DNA. This approach simplifies digital PCR experiments, saves space, enables multiplexing by using a separate area for each sample on one chip, and facilitates the development of new high-performance diagnostic tools for resource-limited applications. The theory and software presented here are general and applicable to designing and analyzing other digital analytical platforms, including digital immunoassays and digital bacterial analysis. The approach is not limited to SlipChip and could also be useful for designing systems on valve-based and droplet-based platforms. In a separate publication by Shen et al. (J. Am. Chem. Soc., 2011, DOI: 10.1021/ja2060116), this approach is used to design and test digital RT-PCR devices for quantifying RNA.
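    The statistics behind such designs are standard Poisson/most-probable-number calculations: a well of volume v is negative with probability exp(-λv), so the concentration λ can be estimated by maximum likelihood across all volume groups. A minimal sketch of that estimator (this is not the paper's software, and the example counts are invented):

        import numpy as np
        from scipy.optimize import brentq

        def mpn_estimate(volumes_nl, n_wells, n_positive):
            # Maximum-likelihood concentration (molecules/nL) from a multivolume
            # digital experiment, assuming Poisson loading of the wells.
            v = np.asarray(volumes_nl, dtype=float)
            n = np.asarray(n_wells, dtype=float)
            p = np.asarray(n_positive, dtype=float)

            def score(lam):
                # Derivative of the log-likelihood; decreasing in lam.
                with np.errstate(over="ignore"):
                    return np.sum(p * v / np.expm1(lam * v)) - np.sum((n - p) * v)

            return brentq(score, 1e-12, 1e3)   # bracket and solve score(lam) = 0

        # Hypothetical 1-125 nL design with invented counts of positive wells:
        lam = mpn_estimate([1, 5, 25, 125], [160, 160, 160, 32], [12, 50, 130, 30])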

    #REVAL: a semantic evaluation framework for hashtag recommendation

    Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the hashtags recommended by an algorithm are first compared with the ground-truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of evaluating hashtag similarity is inadequate, as it ignores the semantic correlation between the recommended and ground-truth hashtags. To tackle this problem, we propose a novel semantic evaluation framework for hashtag recommendation, called #REval. The framework includes an internal module, referred to as BERTag, which automatically learns hashtag embeddings. Using our proposed #REval-hit-ratio measure, we investigate how the #REval framework performs under different word embedding methods and different numbers of synonyms and hashtags in the recommendation. Our experiments on three large datasets show that #REval gives more meaningful hashtag synonyms for hashtag recommendation evaluation. Our analysis also highlights the sensitivity of the framework to the word embedding technique, with #REval based on BERTag outperforming #REval based on FastText and Word2Vec.
    Comment: 18 pages, 4 figures
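    The exact #REval-hit-ratio definition is in the paper; as a rough illustration of the idea of crediting semantic rather than exact matches, here is a hedged sketch in which a recommendation counts as a hit if it falls among the embedding-space synonyms of a ground-truth hashtag. The function names and the synonym rule are our own assumptions:

        import numpy as np

        def semantic_hit_ratio(recommended, ground_truth, embed, n_synonyms=5):
            # embed: dict mapping a hashtag to its vector (e.g. from BERTag,
            # FastText or Word2Vec). Cosine similarity picks the synonyms.
            vocab = list(embed)
            vectors = np.array([embed[t] for t in vocab], dtype=float)
            vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

            def synonyms(tag):
                q = embed[tag] / np.linalg.norm(embed[tag])
                nearest = np.argsort(vectors @ q)[::-1][:n_synonyms + 1]
                return {vocab[i] for i in nearest}   # includes the tag itself

            expanded = set().union(*(synonyms(t) for t in ground_truth))
            hits = sum(1 for t in recommended if t in expanded)
            return hits / len(recommended)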

    A Comparative Review of Recent Kinect-based Action Recognition Algorithms

    Video-based human action recognition is currently one of the most active research areas in computer vision. Various studies indicate that the performance of action recognition depends strongly on the type of features being extracted and on how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been proposed in the literature. However, there is still no thorough comparison of these Kinect-based techniques grouped by feature type, such as handcrafted versus deep learning features and depth-based versus skeleton-based features. In this paper, we analyze and compare ten recent Kinect-based algorithms for both cross-subject and cross-view action recognition using six benchmark datasets. In addition, we have implemented and improved some of these techniques and included their variants in the comparison. Our experiments show that most methods perform better on cross-subject action recognition than on cross-view action recognition, that skeleton-based features are more robust for cross-view recognition than depth-based features, and that deep learning features are suitable for large datasets.
    Comment: Accepted by the IEEE Transactions on Image Processing

    Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition with CNNs

    In this paper, we revive the use of old-fashioned handcrafted video representations for action recognition and put new life into these techniques via a CNN-based hallucination step. Although the I3D model (among others) already uses RGB and optical flow frames, it thrives on combining its output with Improved Dense Trajectory (IDT) features: low-level video descriptors encoded via Bag-of-Words (BoW) and Fisher Vectors (FV). Such a fusion of CNNs and handcrafted representations is time-consuming due to pre-processing, descriptor extraction, encoding and parameter tuning. We therefore propose an end-to-end trainable network with streams which learn the IDT-based BoW/FV representations at the training stage and are simple to integrate with the I3D model. Specifically, each stream takes I3D feature maps ahead of the last 1D convolutional layer and learns to 'translate' these maps into BoW/FV representations, so our model can hallucinate and use such synthesized BoW/FV representations at the testing stage. We show that even the features of the entire I3D optical flow stream can be hallucinated, further simplifying the pipeline. Our model saves 20-55 hours of computation and yields state-of-the-art results on four publicly available datasets.
    Comment: First two authors contributed equally. This paper is accepted by ICCV'19
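    As a rough architectural sketch (the layer sizes, temporal pooling and training loop below are our assumptions, not the paper's exact design), each hallucination stream can be read as a small regressor trained to map I3D feature maps onto precomputed IDT-based BoW/FV targets:

        import torch
        import torch.nn as nn

        class HallucinationStream(nn.Module):
            # Maps I3D feature maps taken ahead of the last 1D conv layer to a
            # BoW/FV-sized vector; at test time this output stands in for the
            # real handcrafted representation.
            def __init__(self, i3d_dim=1024, target_dim=2048):
                super().__init__()
                self.translate = nn.Sequential(
                    nn.Linear(i3d_dim, 2048), nn.ReLU(),
                    nn.Linear(2048, target_dim),
                )

            def forward(self, feat_map):
                # feat_map: (batch, i3d_dim, time); pool over time, then 'translate'.
                return self.translate(feat_map.mean(dim=-1))

        # One training step against FV targets precomputed for the training clips
        # (random tensors stand in for real data here):
        stream, mse = HallucinationStream(), nn.MSELoss()
        opt = torch.optim.Adam(stream.parameters(), lr=1e-4)
        feat, fv_target = torch.randn(8, 1024, 16), torch.randn(8, 2048)
        loss = mse(stream(feat), fv_target)
        opt.zero_grad(); loss.backward(); opt.step()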